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Abstract 

We propose an efficient method to solve the eigenproblem of N x N Symmetric 
Tridiagonal (ST) matrices. Unlike the standard eigensolvers which necessitate 0(N 3 ) 
operations to compute the eigenvectors of such ST matrices, the proposed method 
computes both the eigenvalues and eigenvectors with only 0(N 2 ) operations. The 
method is based on serial implementation of the recently introduced Divide and Con- 
quer (DC) algorithm [3],[l],[4]. It exploits the fact that by 0(N 2 ) of DC operations, 
one can compute the eigenvalues of N X N ST matrix and a finite number of pairs of 
successive rows of its eigenvector matrix. The rest of the eigenvectors-all of them or 
one at the time, are computed by linear three-term recurrence relations. We conclude 
with numerical examples, which demonstrate the superiority of the proposed method 
by saving an order of magnitude in execution time at the expense of sacrificing a few 
orders of accuracy. 
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1. INTRODUCTION 


The QR algorithm computes the eigenvalues of an N x N Symmetric Tridiagonal (ST) 
matrix with 0(N 2 ) operations, while the corresponding eigenvector matrix is accumulated 
during the algorithm at the expense of 0(N 3 ) operations. The additional order of mag- 
nitude required to compute the eigenvectors is typical of serial algorithms. A complete 
0(N 2 ) eigensolver can be obtained by appending such serial algorithms with the Inverse 
Iteration (INVIT) method. Indeed, 0{N ) operations of only one INVIT will suffice to 
accurately compute each eigenvector corresponding to an isolated eigenvalue [8, Chapter 
4]. In case of clustered eigenvalues, however, the INVIT requires a more carefully chosen 
initialization, in order to avoid the loss of mutual orthogonality between the corresponding, 
closely “related”, eigenvectors. 1 

Recently, a parallel Divide and Conquer algorithm was introduced for computing the 
spectral decomposition of ST matrices [3],[1],[4]. A serial implementation of this algo- 
rithm, described in Section 2, requires the same number of operations. Namely, the eigen- 
values - which coincide with the roots of the so-called secular equation [6], are computed 
at the expense of no more than 0(N 2 ) sequential operations, while the associated eigen- 
vectors necessitate 0(N 3 ) sequential operations. As before, the INVIT- taken with the 
necessary precautions, is available here as an 0{N 2 ) method to compute these eigenvec- 
tors. In Sections 3 and 4 we propose an alternative efficient method, derived from (and 
therefore better suits) the DC algorithm, which computes the eigensystem of N x N ST 
matrices with only 0{N 2 ) sequential operations. The method employs linear three term 
recurrence relations which successively compute the rows of the eigenvector matrix (or the 
components of each of the desired eigenvectors). The coefficients of these relations depend 
on the already computed eigenvalues, and the method hinges on the fact that the initial 
first two rows (or components) for the recurrence relations emerge naturally from the DC 
computation of these eigenvalues. Thus, the input data for the recurrence relations depend 
1 We thank Professor Beresford Parlett for an enlightening discussion on this issue. 
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solely on the 0(N 2 ) operations for the DC calculation of the eigenvalues. Together with 
the additional 0(N 2 ) operations required to carry out these relations, we end up with an 
efficient 0(N 2 ) method to compute the whole eigensystem of ST matrices. It should be 
emphasized that the advantages of the DC algorithm are retained in our case. That is, we 
have a method which on the one hand is well suited to exploit parallelism, while on the 
other hand, even when run in serial mode on large problems, the method is faster than 
the previously best sequential algorithms, e.g., [3], [4]. 

Due to the sensitivity of the three term recurrence relations, their input data should 
be provided with high accuracy. To achieve this, we employ in Section 5 an improved root 
finder - interesting for its own sake - in order to solve the secular equation mentioned 
above. Numerical examples which demonstrate the efficiency as well as the limitations of 
the proposed method are presented in Section 6. 
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2. THE DIVIDE AND CONQUER ALGORITHM - AN OVERVIEW 

Let D n be an N X N diagonal matrix and let D N + cz N z t N be a Rank One Modification 
(ROM) of this matrix by a unit iV-vector z^. 2 The spectral decomposition of such ROM 
matrices is in the heart of the Divide and Conquer (DC) algorithm. Here we note that the 
problem of finding the spectral decomposition of an JV -dimensional ROM matrix, the so- 
called updating problem , can be solved at the expense of no more than Const. N 2 operations 
[l],[3],[4j. Details of this solution are discussed in Section 4. 

With this in mind we now turn to consider the eigenproblem of general N x N Sym- 
metric Tridiagonal (ST) matrices 

tn *12 
*21 *22 


T n = 


*mm 

*m+l,m. 


*mm+l 


*i; — tji 


tN- 1,N 

*W,AT -1 *AW J 

We can assume without restriction that N = 2m is even, and that 2V is already given 
in its unreduced form, i.e., t,- , + i ^ 0, 1 < * < N — 1, for otherwise Tjy is decoupled into 
smaller unreduced ST matrices. Then, we can split Tn into the sum of 


T n = 


*11 

*12 


*22 

* 



tmm-P ' 

0 


0 • 

*ro+l,ro+l ft 


• 

tN,N 


+ ft 


0 • 0 
1 1 

’ll’’ 
0 • 0 


ft — *m,m+l > 


i.e., 


T n = 


j'( 1 ) 
■L N 


zi 2) 


+ ftb N b* N , b N = + eST +1) 


-(i) 

N 

2 2 
term of these two blocks. 


where the blocks T^ 1 and T$ are rx| ST matrices and ft = * m , m+ i ^ 0 is the coupling 


N ' . N 


throughout the paper, vectors and matrices will be used with a subscript index denoting their dimension. 
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The DC algorithm [3][l][4] is based on the fact that in order to solve the eigenproblem 
of iV-dimensional ST matrices, it is sufficient to solve this problem for y-dimensional ST 
matrices. Specifically, if 


PjjW' = Is 


( 2 . 2 ) 


'T'W _ pW A (!) pi 1 )* 

-L JV r N ? 

2 2 2 2 

nr»( 2 ) — p( 2 ) a ( 2 ) p( 2 ) t p( 2 )p( 2 )* — r 

-L 5 * E. ^ E. “ > 

2 2 2 2 2 2 2 


are the spectral decompositions of the - Xy ST matrices TV and TV respectively, then 

2 2 

we can compute the spectral decomposition of the JV x N ST matrix, Tn, by the following 
procedure: 


I. First, we evaluate the unit iV-vector zn, 


(2.3 a) 


zn = 


1 

v/2 


P [ s ] 

T 




d( 2 ) 

* N 


bN , bs = 


0 

1 

1 

0 


so that by (2.1), (2.2) and (2.3a), Tn is unitarily similar to the ROM matrix 
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pPaSW* 

T T T 


p( 2 ) A ( 2 ) p( 2 )* 

r N jv n r N 
T T T J 


+ Pbtfbpf — 
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T J 
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T J 


(Dn + 2 flZNZff) 


rH 11 i 
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*0 

i 

5 

D n = 

' A W 

JX N 

L T 


II. Second, we solve the updating problem - we find the spectral decomposition of the 
ROM matrix 


(2.36) 


Dn + oznz'n — Qn^nQn > QnQ*n ~ — 2/?. 
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III. Finally, we compute the unitary matrix 

rH 1} 

(2.3c) Ps — 


p( 2 ) 
r N . 
T J 


Qn , 


and obtain, by (2.3b) and (2.3c), the spectral decomposition of T N as 


T n = 




p( 2) , 

MP J 


QnA-nQn 


r P { » ] 


( 2 ) 


T 


= PnA-nPn > PnPn = In • 


This process can be applied recursively: the iV-dimensional eigenproblem of TV is solved in 

terms of two independent y-dimensional eigenproblems of and T^_\ which in turn are 

2 2 

solved in terms of four independent y-dimensional eigenproblems of T^, T^\ T$, T\?\ 

etc. Thus, the DC algorithm for an N = 2"-dimensional ST matrix, TV, is organized as 

follows. After n — 1 splitting steps we are left with 2 n-2 pairs of 2 x 2 <ST matrices. In the 

first iteration they are used to construct, with the help of (2.3a)-(2.3c), the eigensystem 

of 2 n-s pairs of 4 x 4 ST matrices; in the second iteration, one constructs the eigensystem 

of 2 n_4 pairs of 8 x 8 ST matrices, etc.; after n — 2 such iterations we end up with the 

eigensy stems of the pair T^ , T $ , and the last n—1 iteration solves the eigenproblem of T V* 

2 2 

A sequential implementation of a typical A;-th iteration consists of 2 n ~ k ~ 1 times, evaluating 
the 2 k+1 -dimensional unit vectors z in the first stage, (2.3a), solving 2* + ^dimensional ROM 
eigenproblems in the second stage, (2.3b), and computing 2* +1 -dimensional products of 
unitary matrices in the third stage, (2.3c). 

The total amount of work spent on the first two stages, (2.3a) and (2.3b), of all itera- 
tions, does not exceed 2Const.iV 2 ; the total work required for computing the eigenvectors 
in (2.3c) is | N 3 . Thus, the total operations cost of the DC algorithm for finding the eigen- 
system (both the eigenvalues and eigenvectors) of an N x N ST matrix is |iV r3 -(-2Const.iV 2 . 

If only the eigenvalues are required, then we can do better by saving the 0(N 3 ) opera- 
tions required to compute the eigenvectors in the third stage (2.3c). Instead, the first stage 
of a typical A;-th iteration, which requires 2 n- * -1 different evaluations of 2* + ^dimensional 
unit vectors of the form 




v/2 


r pW 

2 * 


1* 


p ( 2 ) 


^2*+* 9 
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can be efficiently implemented as folllows: according to (2.3c), is represented by a 
successive product of 






j = k, k - 1 . . . 1, 


where QiJ were found by spectral decompositions of ROM matrices in previous iterations; 


-u 


similarly, pjfl is represented by a successive product of 


r«r +1) 


<?r +,) 


j = k,k — 1 . . . 1 . 


Hence, we can evaluate each of the 2" * 1 different vectors, z 2 t+i , as z 2 *+i = 2 2 t+i , where 


(2.4a) z$ + x 


U) - 


«<!> 






( 2 t+i > J 1,2 ... fc , ^ 2 t+i — .6 2 *+i 1 


at the expense of |4 t+1 operations. The total work spent on the first stages in all iterations 
is therefore < | N 2 . This is complemented with the solution of 2" - * -1 different updating 
problems, see (2.3b) 


(2.46) 


■D 2 t+i + azjk+iz^k+i = Q 2 *+i A 2 *+iQ 2 t+i . 


The total work spent on the second stage in all iterations amounts to 2Const.iV 2 . Con- 
sequently, the total operations cost of the DC algorithm, (2.4a), (2.4b), for finding the 
eigenvalues of an N x N ST matrix is (2Const. + | )N 2 . 
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3. AN 0(N 2 ) METHOD FOR THE EIGENSYSTEM OF N x N ST MATRIX 

Given an N x N ST matrix, Tat, we can compute its eigenvalues by the DC algorithm 
(2.4a), (2.4b) at the expense of no more than 0(N 2 ) operations. 3 Thus, it remains to 
compute efficiently, i.e., with 0(N 2 ) operations, the eigenvectors of this matrix. To this 
end we may proceed as follows. 

We seek for the unitary matrix P N , PnPn ~ > which diagonalizes T N , 

(3.1) Tn = PnA-nPn • 

Let = Pflf denote the t-th row vector of Pjy. Equating the t-th rows of 


TnPn — Pn^n 


we obtain, in view of the reduced tridiagonal structure of T N , 

( 3 * 2 ) ^ + U,i+xP^N ^ = PaT-^JVj U,i±l 7^ 0 • 

Equation (3.2) is a linear three-term recurrence relation between the rows, p$, of P^, 
whose coefficients are determined by the entries of T N . The input data required in order 
to solve these relations uniquely, consists of 


1. The eigenvalues A N — diag(A^, A^ 2 ) . . . A^) of T N , which determine the terms 
Pjv A at = (A^)p,i, X^pa . . . ^ n ^Pin) on. the right of (3.2). The eigenvalues are com- 
puted by the DC algorithm (2.4a), (2.4b) with (2Const. + | )N 2 operations. 

2. Two successive rows of Pat, which will serve as initial data for the recursive three- term 
relations (3.2). The proposed method hinges on the observation that two such rows 
emerge naturally from that part of the DC algorithm (2.4a), (2.4b) which computes 
the eigenvalues of T N . Indeed, from the last n — 1 iteration of (2.4a) we have at our 
disposal the unit TV-vector, zn, which according to (2.3a) satisfies 

3 In fact, as observed by Cuppen [3], this number of operations can be substantially reduced - up to 
0{NlogN) operations, in practical cases which employ sufficiently many deflations. 
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(3.3) 


= y/2- z N 


1 

S*i« 
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' — <■ , 
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p( 2 ) 

A n 

2 J 


1 

0 

5 

4? 

L 2 J 


Hence z$ and z$ are in fact the last and first column vectors of and Pffl* respec- 

T T T T 

tively. Put differently, (-z*\ On )‘ and (On,z$V are rows number m = ^ and m + 1 of 

2 2 2-5“ * 

r pw i 

r tL 

2 . Consequently, equating the m and m + 1 rows of (2.3c) we obtain the two 

P N 

initial successive rows as 

(3.4) 


pST 1 = (4P,0»)‘Q K , 

2 3 

p!T + 1) = (0 # ,*g)*^ ; 


the remaining rows of P N are computed recursively by (3.2) 


(3.5a) ^ — - [p"(A N — ^t/jv) — ^1 , 

W,i+1 

(3.56) pfc _1) = — [pSv (Aat - UjIn) ~ <m+iPw +1) ] , 


i = m + l...N — 1 , 


t = m, m — 1 . . . 2 . 


The operations cost of (3.4)-(3.5) does not exceed 3 N 2 . Thus (2. 4a), (2. 4b) together with 
(3.4), (3.5) provide us with an 0(N 2 ) method for computing the whole eigensystem of an 
N x JV ST matrices. 

The error analysis of the proposed method depends on two ingredients: 


1. The accuracy of the input data for (3.5a), (3.5b) — the eigenvalues 

An = diag(A^), A^ 2 ) . . . A^), and the two successive rows pffl, p^ +1 ^ of P^. This is 
determined by the stable behavior of the DC algorithm which was carefully analyzed 
in [3] [5]. Here we note that an accurate solution of the updating problems (2.4b) is 
essential for the stable behaviour of the DC algorithm. In Section 5 we discuss an 
improved version of the root finder [2] recommended by Cuppen [3] , which accurately 
computes the eigenvalues A# as the roots of the secular (characteristic) equation [6] 
associated with the ROM matrix D N + oznZ 1 n in (2.3b). 
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2. The second source of error is due to accumulation of rounding errors in the recurrence 
relations (3.5a), (3.5b). In order to examine this error accumulation, we rewrite (3.5) 
as a one-step iteration 


(3.6a) [Pn +1 \p$] — [P$ > Pn ^ 


**,«- 1 T 

L. 


— tijlrf] In 

On 


i = m + 1 . . . N — 1, 


(3.66) [p\f \pjv ] — [py,A + r — 


- t iti I N ] In , * = rn, m - 1 . . . 2 . 

l— — J 


An indication to stability properties of (3.6a), (3.6b) is provided by the eigenvalues, k = /c- 
of the two 2 N x 2 N matrices in the right-hand sides, i.e., for t = m+l,m + 2...iV — 1 
we have 


(3.7 a) *,•,«+ i (/c j) 2 - (Ay - £,.,•) /cj + = 0, j = 1,2... N 

and for * = m, m — 1 ... 2 we have 

(3.76) <m-i(«J) 2 “ (Ay — ti,i) K fj + £»,«+i = 0 , j = 1,2... N . 

Hence the error in the t-th iteration of (3.5) is amplified by a factor of at least 

^ (<) = i^(|/ct.|,|/cr.|). 

Thus, the method is expected to be stable if 

(3.8) II 9 {i) = II max (1^1, |/c7.|) < Const. 

t=2 t=2 

As a canonical example for such stable behavior, let us consider ST matrices whose 
entries are ‘slowly varying’ along the diagonals, i.e., £ t) y ~ £,+ i,y+i. Now, if the superdiag- 
onal entries are properly scaled so that also £, it +i ~ i, then by Gershgorin estimate we 
have for any 1 < j < N, 

| Ay — £,,,■ | 2 ~ (| £,-,,•- 1 1 + |£.-,«+i|) 2 ~ 4 1 | | 1 1 , 
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and hence the product of the characterisitc roots tcf, is of order unity, for 


i ±,2 |(^i 2 < f,- Tl 

\ K ij I = I 7T+ 1 ~ I: 1 ~ 1 • 

W,*±l 

If on the other hand (3.8) fails, we have an unstable error growth at the amount 

N - 1 

II ff'*' 1, as confirmed by the numerical examples demonstrated in Section 6. Typi- 

i=2 

cally, such an instability shows up by the loss of orthogonality between the computed rows 
Pn of P N . Hence, one approach to solve the stability problem would be to use reorthogo- 
nalization, once the instability was detected by the loss of orthogonality, consult [3, Section 
3], An alternative approach to overcome the instability problem, which better suits the 
proposed method, is to restart the recurrence relations (3.5) at the current iteration with 
two new successive rows of Pn. How should we obtain such two successive rows to restart 


with? Consider for example the N = 4m-dimensional problem. The iteration before the 


last of (2.4a) provides us with two ^-dimensional unit vectors, say zn and wn, where 

* 2 2 


(3.9a) 


(3.96) 
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l- 1 O 
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V 

N 

1 

^3 


» 

O • 

5 

1 


r «| 

t 

Pol 


r n 

pi 3 ) : 
r N 

4 


1 


«e» 

4 

: P (4) 

• r N 

4 


, 

iH O 

i 

5 

(2) 

wV 

4 



) 


= y/2wN_ . 
2 


As before, we obtain the m and m + 1 rows of P^P as 

2 


(3.10a) 


= (4 

3 4 4 


(i) 

2 


l 1 ." 1 ) _ (M) 

N — [Z 
2 

P ( i' m+11 = (o tt> *2>)‘Q ( i> 

2 4 4 2 


and the m and m + 1 rows of P$ as 

T 


(3.106) 


= (u>/£ , 0nYQ$ 

2 4 4 2 

p ( r +i ' = (o*,. 4ws ■ 


10 



Consequently, we can compute with 0(N 2 ) operations rows number m, m + 1, 3m, and 
3m + 1 of Pn, for by (2.3c) we have 

pST’ = {p { k"\o«)’Q„ 

2 3 

p'n +1) = (p < »” +1) ,0 2 .)'Q k 

(3.11) 

pSt> = ( o f ,p ( | H )'Q W 
P U’ n+1) = (o», P t'" +1) )‘QN 

In a similar manner one can restart the recurrence relations (3.5) at any desired iteration. 
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4. THE EIGENVECTORS OF T N - ONE AT THE TIME 


In the previous section we discussed an 0(N 2 ) method for computing the whole eigen- 
system - eigenvalues and eigenvectors of an N x N ST matrix. In several applications 
one is interested in only a few of the eigenvectors of Tjv. We now present a variant of 
this method which enables one to compute each one of the desired eigenvectors with O(N) 
operations. 

As before, we first prepare, with the help of the DC algorithm (2.4a), (2.4b), (3.4), the 
eigenvalues {A^}^ and the two middle successive rows, pffl and p^ n+1 ^ of P^. This can 
be done at the expense of 0(N 2 ) operations, and in many practical cases with even less. 
Equipped with this we can compute the eigenvector x^ — {x^ l \x^ . . . x^) corresponding 
to the eigenvalue, say, A^ 

(4.1) TffXff = X^Xff . 

Equation (4.1) gives us the linear three term recurrence relations between the compo- 
nents of x 

(4.2) t it i- ix (,_1) + t,,iX (,) + t M+ 1 x (,+1) = A (,) x (,) . 

Since x# coincides with the j-th column of Pn, we have its two middle entries x^ and 
x ( m +i) f rom ^^0 j'_th entries of pffl and p^T +1 \ The rest of the entries are computed 
recursively with 3 N operations by 


(4.3a) 

x “ +1) = T~ [< A > - 

W,i+1 L 

i)x (t ) - ti'i- ix ( * 1J ] , t = m + 1 . . . N - 1 , 

(4.36) 

*«-> = [(A,. - ( 

U t i- i 1 

i,i)x^ — t iti+ ix {t+1) ] , i = m, m — 1 ... 2 . 


The computation is stable or unstable depending on whether 
(4.4) nmax((4|,ra 

i=2 

is either bounded or >• 1 . 
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5. SOLUTION OF THE UPDATING PROBLEM 


In this section we follow [l] and [3] in a discussion on the promised 0(iV 2 ) method 
for solving the updating problem (2.3b), i.e., computing the eigensystem of + oz^z l N . 
Without loss of generality we may assume that a > 0 and that the problem has been 
deflated, so that the components of zn = (z^ ...z^)* as well as the difference between 
any two diagonal entries of J9jy = diag(du < d 2 2 < • • • dj\w) are different from zero (in 
practice we take a neighbourhood of zero with a preassigned tolerance, say e), consult [l], 
[3], and [4, Section 4]. In such case, it follows that the eigenvalues of the updating problem 
AW,* = 1,2... iV, strictly interlace with those of Djf [1, Theorem 1], [3, Theorem 2.1] 

(5.1) du < A^ < dn < A* 2 ) < • • • < A^ < djfN + o = djv+i.w+i 

With this in mind we now turn to compute the required eigenvalues, A = A^, as the 
roots of the so called secular equation [6] 

N (*U)\ 2 

(5.2) /(A) =l + aJ2 j _ = 0 . 

j = i a ii A 

The function /(A) is the rational representation of the characteristic polynomial associated 
with Dn + azffZff, and the interlacing property ensures that / has N simple roots, A^, 
lying in the open intervals (d„-, d t+lt+1 ), t = l,2...iV. We shall mention two zero-finders 
which have been advocated to find these roots: 

1. The zero-finder proposed by Bunch et al. [1] which is based on rational interpolation, 
employs the values of /(A) and its first derivative, /'( A). The advantage of this zero- 
finder (which will be referred to as ‘zeroinder’) is that it produces a monotonic 
sequence of approximations in (d,-,-,d, +1( , + i) which converges quadratically to A^. 
However, it is very sensitive near the ends of the intervals, (d,-,-,d, +li , +1 ), where the 
derivative involved, /'( A), become singular. 

2. Cuppen [3] advocated the ‘zeroinrat’ zero-finder of Bus and Dekker [2] which is 
based on rational interpolation of three /-values in the interval (d t - t| d t+ i it+ x). This 
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algorithm is more robust than the ‘zeroinder’, for it does not involve /'(A); conse- 
quently, it avoids the previous difficulty of singular derivatives near da and moreover, 
it saves half the operations per iteration. Yet, the current ‘zeroinrat’ algorithm lacks 
the monotonicity property we had before, and therefore, it requires safeguarding to 
ensure that we remain within the desired interval (this decreases the convergence 
rate to ~ 1.839). 


Asstiming that either one of these zero-finders requires no more than a constant number 
of iterations to compute (with some preassigned tolerable accuracy) each root of (5.2), then 
the required eigenvalues A^,t = 1,2... TV, are obtained by 0(7V 2 ) operations. Equipped 
with these eigenvalues we now may use the Sherman-Morrison formula to compute the 
associated normalized eigenvectors, q^\ which form the column vectors of Qx in (2.3b), 
as [1, Section 4] 


(5.3) 


(i) _ (Dn ~ A ^In) * z n 
,w “ ||(D„ - AW/ J ,)-i*„|| ’ 


» = 1...N , 


and the total operations cost does not exceed 0(7V 2 ), as asserted. 

To enhance the stability properties of the whole DC algorithm, the updating problem 
should be solved with maximum accuracy. To achieve this, we now present an efficient 
implementation for the solution of this problem, based on the ingredients described above. 

As a first step we reformulate (5.2) in a manner suggested in [l]. By the interlacing 
property (5.1) we have 


N 


(5.4) 


A<’> = d„ + , 0 < (.« < 1 , Y. ** (0 = 1 • 


»=1 


For i = 1,2, •••,7V we make the change of variables, A = da + an, so that instead of 
/ = /(A), we now obtain TV different rational functions, W,(/z), 

dim d;i 


(5.5) 


A (z {i) Y 
Wi{n) = i + £ 


c _ 

Oji = 


i= 1,2... TV, 


“1 ~ A* ' o 

each of which has a simple root, n = in the open interval (<5„- = 0,<5 f+ i >t ). Computing 
the root of Wi(fi) in this interval - rather than the root of /(A) in the (da, d, + i,,+i) interval 
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- has the advantage that W-(fi) is uniformly bounded from below (by 1) rather than 
having /'(A) > A, as in [3, Theorem 3.1]. The computation of the desired root proceeds 
by carefully monitoring a mixture of the two zero-finders mentioned above. Namely, the 
‘zeroinder’ algorithm will be used when we are well inside the interval of interest (0, 
while we switch to the ‘zeroinrat’ algorithm when we approach either end of this interval. 
To decide upon the switching policy we first quote from [5] (see [4]). 


LEMMA 5.1. Assume that the deflation test |z^| > e is satisfied. Then either we have 

c 2 

(5.6a) 

or alternatively 
(5.66) 


a 2 (2 + 6 t+M ) <5, ' +1,i - /x(,) - 2* +w 


2* +U - ^ ~ l 1 «r*(2 + «,+,.,)) * +W 


t 2 (2 + 6,+i.i) 

The bounds on the left and right-hand-sides of (5.6) yield a closed subinterval 

which encloses ^ . (Experiments have shown that these bounds may actually be achieved) . 

A more practical indication to the location of n W is obtained from the following con- 
siderations. The rational function 

(* (i) ) 2 


(5.7a) 


, N M) 2 ( 2 (*+l))2 

Vi(fj) = Const,- + - — — + ' 


Const,- = ^2 


-fj, 6 i+u - n ’ ’ .£? +1 6ji - < 5 ,- +m 

has a simple root, in the interval (0, 6, +1 ,-), Since Vi(fi) dominates Wi(n) in that 
interval and they are both monotonically increasing, we can use this root (which is found 
by solving a simple quadratic equation) as a lower bound for Similarly, the function 

M*')p fe(*' +1 h 2 ^ (z ^) 2 

(5.76) Ui{n) — Const< + — + — , Const,- = c — 

~ f i ®»+M “ ^ ;#»,«+ 1 

has a simple root, h^\ in the interval (0, 6,- +li ,-) which may serve as an upper bound for 




(0 


Returning to our problem of finding the roots of Wi(fi) in (5.5), we use the ‘zeroinder’ 
algorithm when inside the (0,6, •+!_,) interval. This requires us to compute W-(n), and 
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Lemma 5.1 indicates that as we approach either end of the interval, the computation of 
involves factors of e~ 4 which will lead to an underflow problem. To avoid this 
situation we use a switching policy, which in each step tests if either one of the following 
inequalities is satisfied 

(5.8) < L® , fcW > (•') , fcW < £(«') 


as an indication that we are in the neighborhood of the singular ends, in which case we use 
the ‘zeroinrat’ algorithm instead. This ‘switching’ policy enabled us to achieve with the 
usual 64-bit arithmetic, more than satisfactory results that otherwise would have required 
the less attractive extended precision arithmetic. 

Concerning the computation of the eigenvectors in (5.3), we note that it is possible to 
have severe round-off when AW is close to d„ or d,+i,t+i [3, Section 2]. The reformulation 
of the eigenvalue problem in (5.5) enables one to avoid half of these round-off problems, 
namely when A^) is close to d„-. Indeed, the normalized eigenvectors, < 7 $, are now given 
by 


(5.9) 


JO 

hn 




= diag(«ii 1^1 


Using (5.9) instead of (5.3) avoids cancellation which arises when AW is too close to da, 
i.e., when /z^) is too close to zero, for = 0 in this case. We are still left with the other 
half of the cancellation problem when A^) is too close to d,+i.i+i. If this is indeed the case 
(as we can foresee by computing the practical bounds for /^*) from (5.7a), (5.7b)), then we 
propose to perform yet another reformulation of our eigenvalue problem, using 


(5.10) 


A w = di+ i, i+ i - an [l) 


instead of (5.4). In this case the role of the rational functions W,(/z) in (5.5) is played by 

(5.11) lPiW = i + Er®-, = d “~ . »• = !.*•••» 

d ),i + 1 ~ r l a 
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each of which has a simple root rj = in the open interval (0, ^M+l) " The corresponding 

normalized eigenvectors are given by 


( 5 . 12 ) 


(0 [^V] "rdo c \ , _{*) r 

n = ^=nr, : — - » "jv - diag(d lif+1 . . . ^,.+i) + r ’In , 


q " = 


and cancellation which arises when is too close to d, +li , +1 , i.e., when r/M is too close 
to zero is avoided for ^i+i^+i = 0. 
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6. NUMERICAL EXPERIMENTS 


The main object of our experiments was a comparison between the standard 0(N 3 ) 
DC algorithm for computing the eigensystems of N x N ST matrices, and the proposed 
0(N 2 ) method in (2.4a), (2.4b) and (3.4), which makes use of the three-term recurrence 
relations (3.5a), (3.5b). The input data for these relations - the eigenvalues and the 
two initial successive row vectors P;T +1 \ were supplied with maximum accuracy, with 
the help of the updating solver described in Section 5 which avoids extended precision. 
Indeed all our calculations, including the pathologically ill-conditioned Wilkinson’s 
matrices, were carried out with a 64-bit arithmetic. 

The first set of results includes ‘well-behaved’ matrices taken from [7, (7.4)-(7.9)]. The 
entries along the diagonals of these matrices are ‘slowly varying’ and their eigenvalues are 
equally distributed. The stability analysis in Section 3 indicates bounded amplification 
factors in these cases, and the numerical results confirmed the expected stable behavior 
of our method. Table 1 summarizes the results for the prototype ST matrix of this group 
where = 2 — |t — j |. 



Standard DC Algorithm 

The Proposed Method 

N 

II TP - PA|U 


II TP - PA|U 

ll-P'-P - rib 

101 

2.5E-15 

6.2E-16 

9.5E-15 

7.2E-15 

201 

2.6E-15 

2.5E-15 

2.2E-14 

1.5E-14 

301 

3.0E-15 

2.8E-15 

2.9E-14 

8.8E-14 

i 

401 

1 

4.0E-15 

6.9E-15 

2.5E-13 

1.2E-13 


Table 1: Results for T[l,2,l] matrix of order N: 
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Since the rows of P were constructed by equating to zero rows 2, 3 ... N — 1 of TP — PA, 
the quantities on the left columns, || TP — PA||oo, stand for 

maxfll^ipW + fi, 2 p (2) - p (1) A||, ||^ i 7V-iP (JV_1) + t NiN pW - p (JV) A||) . 

They may serve us as a quantitive indication of the accumulation of rounding errors in the 
three-term recursion relations (3.5), which is responsible for the loss of (no more than) two 
orders of magnitude relative to the standard algorithm. The advantage of the proposed 
method lies upon the fact that the results on the right columns are obtained by saving 
order of magnitude in execution time relative to the results on the left columns. 

Next, we turn to the second group of matrices which consist of Wilkinson’s matrices, 
The superdiagonals of these matrices are properly scaled to begin with - they all 
equal one; the entries along the main diagonal, however, diag(^yi, . . . 1, 0, 1, . . . ^yi) 
are far from being ‘slowly varying’. This leads to amplification factors of the recurrence 
relations (3.5) of order ~ which indicates loss of all (64-bit precision) significant 

figures in computing the eigenproblem of of order N ~ 40. Moreover, the largest 

eigenvalues of W + s are clustered in pairs, which may be inseparable up to the 14 th decimal 
digit. This then leads to additional inaccuracies in the updating solution (while seeking 
two extremely close roots of the secular equation) as well as in the deflation process. As a 
result, the initial input data for the recurrence relation will also suffer from loss of accuracy. 
These arguments are well reflected in the following table: 
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Standard DC Algorithm 

The Proposed Method 

N 

II TP - PA|U 

ll p'p - /|i~ 

||TP - PA||oo 

Ii-p'j’ - 1 \\~ 

21 

4.5E-16 

2.5E-16 

1.2E-12 

9.8E-10 

41 

1.3E-15 

9.4E-16 

3.7E-8 

7.4E-7 

47 

2.0E-15 

9.1E-16 

5.3E-3 

1.0E-3 

49 

2.0E-15 

9.8E-16 

0.12 

0.23 


Table 2: Results for the W +N matrices. 

An attempt to improve the results of our method was made, in order to be competetive 
with the standard algorithm which gave excellent results for W+jv up to order N = 201. To 
this end we have appended our method with the restarting procedure described in Section 
3. Thus, by computing the row vectors (here m = PaT\ PaT + 1 ^» an( i p^ m+1 ^ 

as additional input data to restart the three term recurrence relations we were able to 
get decent results for the VF+Ar-matrices up to order N ~ 200. A finite number of such 
restarting procedures would enable us to deal with even larger W^-matrices, still within 
the 0(N 2 ) operations limit. 

Finally, the last group of matrices that were tested consists of randomly generated 
entries in [-1,1]. The results obtained are summarized in Table 3. 
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— 
Standard DC Algorithm 

The Proposed Method 

N 

II TP - PA|U 

l|f’ , P-/|U 

II TP - PA||„ 

iipv* - /||. 

100 

8.4E-15 

9.8E-16 

9.5E-15 

7.6E-15 

200 

5.9E-15 

3.4E-15 

6.2E-9 

9.8E-8 

300 

6.3E-15 

5.6E-15 

4.2E-4 

3.1E-2 

400 

7.2E-15 

6.8E-15 

0(1) 

0(1) 


Table 3: Results for random matrices of order N. 

We observe that excellent results are obtained by our method for such randomly generated 
matrices of order up to N ~ 100. If additional restarting procedures are employed every 
100 — 200 iterations, it would enable us to achieve highly accurate results for matrices of 
almost any practical size. 

In summary, we conclude that the proposed method for solving the eigenproblem of ST 
matrices, provides a competitive alternative to the standard eigensolvers for a wide class 
of such matrices; by sacrificing a few orders of accuracy, the method enables one to save 
order of magnitude in the total execution time. This conclusion was confirmed by further 
extensive numerical experiments reported in [5]. 
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